NWL  REPORT  NO.  1833 


TABLE  OF  EXPECTATIONS 
OF  MEAN  SQUARES  IN  THE  ANALYSIS  OF  VARIANCE 
FOR  CROSSED  CLASSIFICATIONS 


BY 

KLAUS  ABT 

COMPUTATION  AND  ANALYSIS  LABORATORY 


U.  S.  NAVAL  WEAPONS  LABORATORY 
DAHLGREN,  VIRGINIA 


DATE:  2  APRIL  1963 


U.  S.  Naval  Weapons  Laboratory 
Dahlgren,  Virginia 


Table  of  Expectations 
of  Mean  Squares  in  the  Analysis  of  Variance 
for  Crossed  Classifications 


by 

Klaus  Abt 

Computation  and  Analysis  Laboratory 


NWL  REPORT  NO.  1833 
Foundational  Research  Project  K 12007/29C 
18  October  1962 




TABLE  OF  CONTENTS 

Abstract

Foreword

I. Introduction

II. Description and use of the table

1. Notation and linear model

2. Scope of the table

3. Discussion of mean square expectation formulas

4. Approximate F-tests and the estimation of variance components

5. Generalization for the n-way crossed classification

6. The case of orthogonal contrasts in fixed factors

III. References

IV. Table of mean square expectations and variance ratios in the analysis
of variance for:

1. The two-way crossed classification

2. The three-way crossed classification

3. The four-way crossed classification

Appendices

A. Description of method used for the derivation of mean square expectations

B. Proportionality conditions for cell numbers in crossed classifications

C. Distribution




ABSTRACT

This report contains a table of the expectations of mean squares in the analysis of
variance for crossed classifications and the description of the table. In an appendix the
method is outlined which was used to obtain the mean square expectations.

The table covers the two-way, the three-way and the four-way classification with
unequal but proportional as well as with equal cell numbers. The mean square expectations
(and the appropriate variance ratios for testing all nullhypotheses) are listed for all
possible combinations of fixed and random effects classifications. Also, rules are given for
generating the mean square expectations for all situations in the general case of an n-way
crossed classification.




FOREWORD 

The  work  covered  by  this  report  was  performed  as  part  of  Foundational  Research 
Project  No.  K12007/29C,  “Table  of  Expectations  of  Mean  Squares  in  the  Analysis  of 
Variance  (General).”  The  date  of  completion  was  18  October  1962. 


APPROVED  FOR  RELEASE: 


/S/  R.  H.  LYDDANE 
Technical  Director 


I. Introduction

In the application of analysis of variance techniques, the expectations of mean
squares serve two essential purposes: (1) they determine the appropriate F-value for
exactly or approximately testing a stated nullhypothesis, and (2) they indicate how to
estimate a given variance component. For the conceptually simplest and, at the same
time, most frequently occurring classification of statistical data, the crossed classification
(i.e., a classification into rows, columns, slices, etc., by logical reasoning), the
expectations of mean squares in the analysis of variance are commonly known provided
that there are equal numbers of replicated observations in all “cells” formed by the
classification and that the underlying model is either of the “fixed effects” type (“Model
I”) or of the “random effects” type (“Model II”). Here the terms “fixed effects” and
“random effects” relate to the classical definitions by Eisenhart (1947), i.e., in the
linear model, the effects of a given way of classification (the row effects, for example)
upon the variable under investigation are conceived to be fixed (non-random) or to have
been randomly sampled from parent infinite populations, respectively.

If one deals, however, with a three- or more-way classification with an underlying
model of the “mixed effects” type (i.e., with a model containing both fixed and
random effects classifications), it takes some time, even in the case of “equal cell
numbers”, to find ready formulas of mean square expectations in the literature, if it is
possible at all. Some of these formulas are given, for example, by Anderson and
Bancroft (1952), Bennett and Franklin (1954), Cornfield and Tukey (1956) and Brownlee
(1960). One reason for this scarcity seems to be that various models and
methods of different degrees of complexity have been proposed and applied to this case of
mixed effects. This necessarily led to differing mean square expectations and thereby,
understandably, to some confusion. These facts are reviewed in the very lucid summary
paper by Plackett (1960). In the discussion of Plackett’s paper it was also urged that
the theoreticians of analysis of variance should keep closer to the needs of practice
rather than construct highly complicated models which are theoretically right but of
little practical use.

In the most general orthogonal case of unequal but proportional cell numbers (in
crossed classifications, see Appendix B) few attempts have been made to develop general
formulas which would be applicable to all models; see the papers of Wilk and Kempthorne
(1956) and of Bankier and Walpole (1957). These authors use the method of sampling from
finite populations, which seemed to be the only one capable of handling the derivations of
mean square expectations in this case of proportional cell numbers. The method, which
various authors also apply in the case of equal cell numbers (see, for example, Bennett
and Franklin (1954) and Cornfield and Tukey (1956)), is extremely cumbersome, however.
Moreover, in the case of proportional cell numbers, this method seems to lead to
questionable expectations of mean squares for the fixed and the mixed effects models
(see paragraph II.3 below).

Stimulated by these facts the author of this report felt that for the practitioner of
analysis of variance a unified and simplified underlying model and method for the derivation
of mean square expectations should be employed and the expectations themselves be
tabulated. In particular, it was thought that the model, the method and the table should
also cover the most general orthogonal cases of unequal but proportional cell numbers.

This aim was accomplished, it is believed, for n-way crossed classifications in
cases where the underlying model is of the fixed, random or mixed effects type. The results
are presented in this report. Section IV contains the formulas for the expectations
of mean squares (and the appropriate variance ratios for testing all nullhypotheses) in the
analysis of variance for the two-way, three-way and four-way classification, for all three
types of models and for equal as well as unequal but proportional cell numbers. The
method used for the derivations is described in Appendix A. This method is general, however,
and can be applied also to other than crossed classifications. Otherwise its application
is restricted to the “classical” cases of fixed, random or mixed effects. The expectations
of mean squares when the sampling of classification effects is from finite
populations (non-exhaustive) cannot be obtained by the method.


II.  Description  and  use  of  the  table 
II. 1. Notation and linear model

The notation used in this report for the expectations of mean squares and for
the  method  of  their  derivation  was  chosen  such  that  on  one  hand  the  table  would  be  as 
easily  readable  as  possible  and  on  the  other  hand  that  the  notation  would  not  deviate  too 
much  from  that  widely  used  in  the  literature. 

Each  criterion  of  classification  (into  “rows”  or  “columns”  or  “slices,”  etc.)  is 
called  a  “factor,”  and  the  classes  within  each  classification  or  factor  are  called  “levels” 
of  that  factor.  This  terminology  is  used  for  simplicity  only,  and  it  may  well  be  noted  that, 
for  example,  the  blocks  in  a  randomized  block  design  also  will  fall  under  this  definition 
of  factor  levels,  the  “factor”  being  that  of  replications. 

The factors are symbolized by capital script letters $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$, etc., where these
letters never represent any numerical value. They are used only to identify the classifications,
or as arguments, for example in the term MS($\mathcal{A}$) = “mean square (of estimated level
effects) for factor $\mathcal{A}$.”

All terms related to a given factor will show this by a symbol or a subscript which
is the corresponding letter or letter type in the Latin or Greek alphabet: The numbers of
levels of the factors are A, B, C, etc.; the general subscripts indicating the factor levels
are $\alpha$, $\beta$, $\gamma$, etc. Thus, one has

for factor $\mathcal{A}$: $\alpha = 1, \ldots, A$;

for factor $\mathcal{B}$: $\beta = 1, \ldots, B$;

for factor $\mathcal{C}$: $\gamma = 1, \ldots, C$; etc.

In the underlying linear model for the analysis of variance “true” factor level (or “main”)
effects are denoted by small Latin letters and their respective interaction effects by
combinations of the corresponding small Latin letters. The general (true) mean is always
called $\mu$. For example, $a_\alpha$ is the (true) effect of level $\alpha$ of factor $\mathcal{A}$ over the general
(true) mean $\mu$; $ab_{\alpha\beta}$ is the interaction effect of levels $\alpha$ and $\beta$ of factors $\mathcal{A}$ and $\mathcal{B}$,
respectively, etc. These factorial effects are defined in terms of expectations of the
cell responses as is shown in Appendix A. In a two-way classification, for example,
such a “true” cell response is denoted by $\chi_{\alpha\beta}$.

In case of random sampling from infinite populations one has furthermore the notation
for variances:

$$V[a_\alpha] = \sigma_a^2,$$

$$V[ab_{\alpha\beta}] = \sigma_{ab}^2, \text{ etc.}$$

For the classification “Replication within cells” the script letter $\mathcal{R}$ is used, with the
corresponding letters R, $\rho$, and r in the related terms. Thus one has, for example,
$r_{\alpha\beta\rho}$ as the “residual” or error term in the model for a two-way classification. Here,
then, $\rho$ runs from 1 to $R_{\alpha\beta}$, where $R_{\alpha\beta}$ is the number of replicated observations in the
cell which is determined by level $\alpha$ of factor $\mathcal{A}$ and by level $\beta$ of factor $\mathcal{B}$. The residual
or error term always represents the combination of unit error, unit-“treatment” interaction
and technical error. $\sigma_r^2$ is the residual or error variance, with the three above-mentioned
error sources.

The general subscript of an actual observation x is composed of the factor level
subscripts determining the cell in which x is observed and of $\rho$, the index of the particular
replication in that cell. Thus, for example, $x_{\alpha\beta\rho}$ denotes the “$\rho$-th observation in cell
‘$\alpha\beta$’ of a two-way crossed classification.”

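Using the terms explained before, the observation $x_{\alpha\beta\rho}$ in a two-way crossed
classification is then expressed by the linear model

$$x_{\alpha\beta\rho} = \chi_{\alpha\beta} + r_{\alpha\beta\rho}
= \mu + a_\alpha + b_\beta + ab_{\alpha\beta} + r_{\alpha\beta\rho} .$$

Correspondingly, one has for the analysis of variance of a three-way crossed classification
the following underlying model:

$$x_{\alpha\beta\gamma\rho} = \chi_{\alpha\beta\gamma} + r_{\alpha\beta\gamma\rho}
= \mu + a_\alpha + b_\beta + c_\gamma + ab_{\alpha\beta} + ac_{\alpha\gamma} + bc_{\beta\gamma} + abc_{\alpha\beta\gamma} + r_{\alpha\beta\gamma\rho} .$$

For further discussion of the model see paragraph II.3 below and Appendix A.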

Whenever summation (of first order terms) takes place over any one of the subscripts,
this subscript is replaced by a dot. Thus, for example, $x_{\alpha..}$ means the sum of all x-values
observed at level $\alpha$ of factor $\mathcal{A}$ in a two-way classification. Correspondingly, $R_{\alpha.}$ means
the number of all observations made at level $\alpha$ of factor $\mathcal{A}$. Average values are denoted,
as usual, by bars; for example, $\bar{x}_{\alpha..} = x_{\alpha..}/R_{\alpha.}$ means the average of all observations
made at level $\alpha$ of factor $\mathcal{A}$.
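
The dot and bar notation can be made concrete with a small sketch. The Python fragment below is only an illustration: the data values, the nested-list layout and the helper names are hypothetical and do not come from the report. It computes the level sum, the level count and the level average for one factor of a two-way layout.

```python
# Hypothetical two-way layout: cells[alpha][beta] holds the replicated
# observations of cell (alpha, beta). The numbers are made up.
cells = [
    [[4.0, 6.0], [5.0]],                    # first level:  R_11 = 2, R_12 = 1
    [[7.0, 9.0, 8.0, 8.0], [6.0, 10.0]],    # second level: R_21 = 4, R_22 = 2
]


def level_sum(cells, alpha):
    """Sum of all observations at level alpha of the first factor (x with two dots)."""
    return sum(sum(cell) for cell in cells[alpha])


def level_count(cells, alpha):
    """Number of observations at level alpha of the first factor (R with one dot)."""
    return sum(len(cell) for cell in cells[alpha])


def level_mean(cells, alpha):
    """Average of all observations at level alpha: level sum over level count."""
    return level_sum(cells, alpha) / level_count(cells, alpha)
```

Levels are indexed from 0 in the code, whereas the report counts them from 1.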

In the formulas for expectations of mean squares there also appear the symbols $k''_a$,
$k''_b$, ..., $k'''_a$, $k'''_b$, etc. They are defined in paragraph II.3 below. The test values F'
denote the approximate variance ratios as suggested by Cochran (1951). They are explained
more fully together with the numerator and denominator mean squares (for example,
$MS_1(\mathcal{A})$ and $MS_2(\mathcal{A})$, respectively, for factor $\mathcal{A}$) in paragraph II.4 below. An asterisk
(“*”) attached to an F-value (sometimes in addition to the prime) indicates that this
variance ratio tests the stated nullhypothesis only approximately, for distributional reasons.
This also is further discussed in paragraph II.4 below.

All cases for which the mean square expectations and/or the variance ratios are
given in the table are marked by indicative symbols, which are explained in paragraph II.2
below. For example, [2.$\mathcal{B}$.R] means a two-way classification, with factor $\mathcal{B}$ of the
random effects type (factor $\mathcal{A}$ being “fixed”) and with $R_{\alpha\beta} = R$ replicated observations in
each cell. Finally, the abbreviation EMS is used for “expectation(s) of mean square(s).”


II.  2.  Scope  of  the  table 

The  table  given  in  section  IV  contains  the  EMS  (expectations  of  mean  squares)  for 
two-way,  three-way  and  four-way  crossed  classifications  when  the  factor  levels  are 
either  randomly  sampled  from  infinite  parent  populations  (“random  effects”)  or  when 
they  are  fixed  (“fixed  effects”)  or  when  the  factor  levels  are  of  both  types  together  in 
any possible combination (“mixed effects”). In the last case “any possible combination”
merely means that either all factors are “random,” or the first factor is “fixed”
and all others are “random,” or the first two are “fixed,” etc., or, finally, that all
factors are “fixed” and none are “random.”
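
The ordering convention just described means that an n-way classification gives rise to n + 1 model types. A minimal sketch in Python, using plain capital letters as hypothetical stand-ins for the script letters of the factors, enumerates them:

```python
def model_cases(n_factors):
    """List the n + 1 model types: the first k factors fixed, the rest random."""
    letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"[:n_factors]
    cases = []
    for n_fixed in range(n_factors + 1):
        cases.append({"fixed": letters[:n_fixed], "random": letters[n_fixed:]})
    return cases


# Print the four model types of a three-way classification.
for case in model_cases(3):
    print("fixed:", case["fixed"] or "none", "| random:", case["random"] or "none")
```

The first entry is the random effects model, the last the fixed effects model, and the entries in between are the mixed models.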

The  formulas  are  given  for  the  case  of  equal  cell  numbers  R  as  well  as  for  the 
most  general  orthogonal  case  of  unequal  but  proportional  cell  numbers.  As  Wilk  and 
Kempthorne  (1956)  already  put  it,  “a  case  of  ‘proportional  numbers’  can  arise  quite 
naturally  when  there  are  unequal  numbers  of  observations  corresponding  to  only  one 
factor of classification.” Actually, the formulas for the EMS are simplified considerably
if one goes from the most general cases to those where the numbers are equal for one, two or
more factors of classification. Because they can
easily be derived from the EMS for the most general cases, the EMS for these “partial
proportional” cases, as they may be called, are not given. Naturally, the cases of
“equal cell numbers R” could just as easily be derived from the most general ones.
These cases are, however, extensively listed because of their frequent occurrence in
practice. The EMS are likewise not given for the case R = 1 (“one observation per cell”),
because they can most simply be obtained from the cases with equal numbers R per cell.

In  many  situations  the  testing  of  nullhypotheses  is  only  possible  by  construction 
of  linear  combinations  of  mean  squares  for  the  numerator  and  the  denominator  quantities 
of  the  variance  ratio.  Both  quantities  have  to  be  constructed  such  that  they  will  have 
equal expectations if the stated nullhypothesis is true. This test procedure is only
approximate, however, and sometimes requires cumbersome derivations for the approximate
variance ratio F' and its effective degrees of freedom, $f_1$ and $f_2$. In order to save the
user of the table the burden of deriving these formulas, the appropriate test values for
all nullhypotheses, along with the necessary formulas for the degrees of freedom, have
been included in the table. For the sake of consistency, however, this also led to the
inclusion of well-known ordinary F-values, for which the author asks the reader’s
indulgence.

Throughout  the  table  the  approximate  test  procedure  of  Cochran  (1951)  was  applied. 
The  formulas  given  are  more  fully  described  in  paragraph  II.  4  below.  The  author  is 
aware  of  the  fact  that  for  four-  and  more-way  classifications  the  approximations  may  not 
be good. For the “partial proportional” cases (as they were called above and for which
the EMS are not given) only the F- and F'-values are listed. These test values show how
the  situation  gradually  simplifies  when  the  numbers  of  observations  per  level  become 
equal  for  one,  two  and  more  factors. 

In  this  connection,  it  is  worthwhile  to  mention  that,  in  testing  nullhypotheses 
concerning  main  effects  of  or  interactions  between  fixed  factors,  it  makes  no  difference 
whether  or  not  the  numbers  of  observations  at  the  levels  of  the  fixed  factors  are  equal, 
the  numbers  of  observations  having  no  influence  upon  the  proper  F-values. 

The  estimation  of  variance  components  is  easily  achieved  by  using  the  formulas 
for  the  F'-  and  F-values.  This  procedure  is  more  fully  described  in  paragraph  II.  4  below. 

Because  of  space  limitations  and  practical  considerations  it  was  decided  to  go 
only  up  to  the  four-way  classification  in  the  present  table.  However,  the  structures  of 
the  formulas  for  the  two-way,  three-way  and  four-way  classifications  indicate  sufficiently 
the  rules  under  which  the  EMS  have  to  be  formed  for  the  general  n-way  classification. 
These  rules  are  given  in  paragraph  II. 5. 

In order to facilitate the use of the table, tabular summaries for the three classifications
dealt with are given. In each of these summaries the different cases are arranged
in rows and columns according to the various models and characteristics of the cell
numbers, respectively. The column headings give the cell numbers expressed according
to the proportionality conditions imposed upon them. These conditions are:

$$R_{\alpha\beta} = \frac{R_{\alpha.} R_{.\beta}}{R_{..}}$$

for a two-way classification,

$$R_{\alpha\beta\gamma} = \frac{R_{\alpha..} R_{.\beta.} R_{..\gamma}}{R_{...}^2}$$

for a three-way classification, and

$$R_{\alpha\beta\gamma\delta} = \frac{R_{\alpha...} R_{.\beta..} R_{..\gamma.} R_{...\delta}}{R_{....}^3}$$

for a four-way classification.

For a derivation of these conditions see Appendix B.


Thus, if the numbers of observations at the levels of factor $\mathcal{B}$ in a two-way
classification are equal, say, it implies that $R_{.\beta} = \text{const.} = R_{..}/B$, and one gets for this case

$$R_{\alpha\beta} = \frac{R_{\alpha.} R_{.\beta}}{R_{..}} = \frac{R_{\alpha.}}{B} .$$

According to the procedure thus exemplified, the column
headings in the tabular summaries are formed.
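
As a quick check of the two-way condition, the following Python sketch tests whether a table of cell numbers equals the product of its row and column totals divided by the grand total in every cell. The function name and the example tables are hypothetical; exact rational arithmetic is used so that no rounding enters the comparison.

```python
from fractions import Fraction


def is_proportional(counts):
    """Return True if every cell count equals row total * column total / grand total."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    total = sum(row_totals)
    return all(
        Fraction(row_totals[a] * col_totals[b], total) == counts[a][b]
        for a in range(len(counts))
        for b in range(len(counts[0]))
    )


print(is_proportional([[2, 1], [4, 2]]))   # proportional cell numbers
print(is_proportional([[2, 1], [4, 3]]))   # not proportional
```

A table with equal cell numbers passes the same check, which reflects that equal numbers are a special case of proportional ones.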


Each  case  for  which  formulas  are  given  is  marked  by  an  indicative  symbol,  which 
again  is  shown  on  the  page  where  the  EMS  and/or  the  variance  ratios  are  listed.  A  double 
line  frame  indicates  that  the  EMS  and  the  variance  ratios  are  given;  a  single  line  frame 
shows  that  only  the  variance  ratios  are  listed.  The  symbol  is  composed  of  three  parts 
which  are  separated  by  points.  The  first  is  the  number  2,  3  or  4,  showing  the  respective 
number of ways of classification. The next part indicates which of the factors are
“random” by showing the respective script letters. Thus the script letters which do not
appear are those relating to fixed effects factors. The third and last part of the symbol
either consists of the subscripts of those factors for which the numbers of replicated
observations per level are unequal, or it is simply “R” or “1”, indicating “R observations
in each cell” or “1 observation in each cell,” respectively. In the first case, therefore,
the small Greek letters which do not appear are those relating to factors at the levels of
which the numbers of observations are equal. For example, “[3.$\mathcal{BC}$.$\alpha$]” means: “Three-way
(crossed) classification. Factor $\mathcal{A}$ fixed, factors $\mathcal{B}$ and $\mathcal{C}$ random. Unequal numbers
of observations at the levels of factor $\mathcal{A}$, equal numbers of observations at the levels of
factors $\mathcal{B}$ and $\mathcal{C}$.” Another example, namely [2.$\mathcal{B}$.R], was explained at the end of
paragraph II.1. It may be noted that only the typical cases received symbols. Also
given for each one of the three classifications is a table of the corresponding mean
squares in the analysis of variance in their general form as well as in their computational
form.
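
The three-part structure of the indicative symbols can be illustrated by a small decoder. This is only a sketch under stated assumptions: the script letters are written as plain capitals A to D, the Greek subscripts as lower-case a to d, and the function name and output keys are hypothetical.

```python
FACTORS = "ABCD"      # ASCII stand-ins for the script letters of the factors
SUBSCRIPTS = "abcd"   # ASCII stand-ins for the Greek level subscripts


def describe_case(symbol):
    """Decode an indicative symbol such as '[3.BC.a]' into its three parts."""
    ways_text, random_part, third = symbol.strip("[]").split(".")
    ways = int(ways_text)
    # Factors whose letters do not appear in the second part are fixed.
    fixed = "".join(f for f in FACTORS[:ways] if f not in random_part)
    if third in ("R", "1"):
        cells = "R observations per cell" if third == "R" else "1 observation per cell"
        unequal = ""
    else:
        cells = "proportional cell numbers"
        unequal = third
    # Subscripts that do not appear belong to factors with equal numbers per level.
    equal = "".join(s for s in SUBSCRIPTS[:ways] if s not in unequal)
    return {"ways": ways, "fixed": fixed, "random": random_part,
            "cells": cells, "unequal": unequal, "equal": equal}


print(describe_case("[3.BC.a]"))
print(describe_case("[2.B.R]"))
```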


Again because of space limitations, the case of orthogonal contrasts in fixed
factors and their interactions with other factors is not treated in this report. However,
some remarks are made in paragraph II.6 with reference to a later report which will cover
this subject.


II. 3. Discussion of mean square expectation formulas

The  derivation  of  the  formulas  for  the  expectations  of  mean  squares  (EMS)  given  in 
this  table  is  exemplified  and  shown  in  Appendix  A  for  the  two-way  crossed  classification 
with  unequal  but  proportional  cell  numbers.  The  method  of  derivation  is  based  upon 
defining  the  components  of  the  linear  model  in  terms  of  expectations  of  the  “true”  cell 
responses.  This  is  done  by  a  generalized  expectation  operation  which  takes  care  of 
both  “random”  and  “fixed”  effects.  Once  the  terms  of  the  linear  model  have  thus  been 
defined,  and  after  some  distributional  assumptions  concerning  them  are  stated,  no  further 
assumptions  about  them  whatsoever  are  made  for  the  sake  of  arriving  at  the  final  mean 
square  expectation  formulas,  and  the  latter  then  follow  in  a  straightforward  process.  For 
the  special  case  of  equal  cell  numbers  in  the  two-way  crossed  classification  with  one 
factor  fixed,  the  other  random,  the  method  used  for  the  present  table  resembles  in  several 
aspects that used by Scheffé (1956a) in a critical paper about the mixed model. It will be
noted that in the model used in the present report (see Appendix A) the true interaction
effects are tied to the main effects by definition. This is a more realistic situation than
that of “independent” interactions, as has been pointed out in another paper by Scheffé
(1956b).

The  relatively  simple  structure  of  the  formulas,  even  in  the  most  general  cases, 
may  surprise  the  reader  who  is  familiar  with  the  formulas  for  the  case  of  proportional 
cell numbers in the paper of Wilk and Kempthorne (1956). In their “Table 3” (“EMS for
special cases of a two-factor experiment”) these authors have, besides other deviations,
$\sigma_{ab}^2$ appearing in the mean square expectations of both factors even if these factors are
“fixed.”

A sufficient explanation for the discrepancies between the mean square expectations
obtained by these authors and the expectations in the present table may be found by
considering the definition of the true factorial effects. Namely, in the case of unequal but
proportional cell numbers the usual least squares estimates of main effect and interaction
effect contrasts would be biased if related to Wilk and Kempthorne’s definition of the
corresponding true contrasts. If Wilk and Kempthorne’s true effects are marked by asterisks
but otherwise the notation of the present report is retained, one gets, for example, for the
contrast of the estimates of two $\mathcal{B}$ factor effects, $\hat{b}_\beta$ and $\hat{b}_{\beta'}$ with $\beta \ne \beta'$, in a two-way
classification with $\mathcal{A}$ “fixed” and $\mathcal{B}$ “random”:

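$$E[\hat{b}_\beta - \hat{b}_{\beta'}] = E[\bar{x}_{.\beta.} - \bar{x}_{.\beta'.}]
= \frac{\sum_\alpha R_{\alpha.}\,\chi_{\alpha\beta}}{R_{..}} - \frac{\sum_\alpha R_{\alpha.}\,\chi_{\alpha\beta'}}{R_{..}}
\ne \frac{\sum_\alpha \chi_{\alpha\beta}}{A} - \frac{\sum_\alpha \chi_{\alpha\beta'}}{A} .$$

(Here the term on the right-hand side of the inequality sign is the definition of
$b^*_\beta - b^*_{\beta'}$ in Wilk and Kempthorne’s paper.)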

In the method used in the present report the estimates of the true contrasts are
unbiased, see paragraph 2c of Appendix A.


When the cell numbers become equal the formulas for the most general cases as
given in the present table reduce, with one exception, to the familiar ones found
in textbooks and papers. The only exception is that, in case of mixed models, the
interaction variance components carrying the subscript of a fixed factor are multiplied by
the ratio of the number of levels of that factor over that same number minus one. For
example, in case [3.$\mathcal{BC}$.R], $\sigma_{ab}^2$ is multiplied by $A/(A-1)$ in all EMS. Because the
multiplication by this coefficient is consistent, it does not influence the familiar procedure for
testing nullhypotheses. It has, however, an influence upon the estimation of the variance
components concerned (in the above example upon that of $\sigma_{ab}^2$). The explanation for the
appearance of these multipliers is as follows. In the method of sampling from finite
populations, the interaction variance in a two-way classification, say, is defined as

$$(\sigma_{ab}^2)_{F.P.} = \frac{1}{(A^* - 1)(B^* - 1)} \sum_{\alpha=1}^{A^*} \sum_{\beta=1}^{B^*} ab_{\alpha\beta}^2 .$$



Here “F.P.” stands for “finite populations” and A* and B* are the sizes of the respective
populations of factor levels from which A and B levels, respectively, are randomly
sampled. The subtraction of 1 from both A* and B* is done in analogy to the degrees of
freedom (A − 1) and (B − 1) in the samples. In fact, when it comes to exhaustive sampling
in factor $\mathcal{A}$ and to sampling from an infinite population in factor $\mathcal{B}$, i.e., when A → A* and
B* → ∞, one deals with the classical mixed model. If, however, A = A*, it does not
make much sense to keep to the analogy of providing one degree of freedom for the
estimation of the mean and of subtracting it from A*, because, actually, this population
average then is “known” and does not have to be estimated. Therefore, if one has in
mind to make the transition A → A* it would be more logical to define

$$(\sigma_{ab}^2)'_{F.P.} = \frac{1}{A^* (B^* - 1)} \sum_{\alpha=1}^{A^*} \sum_{\beta=1}^{B^*} ab_{\alpha\beta}^2 .$$

Then, when A → A*, independently from B as long as B < B*, one has the relation,
leaving the subscript “F.P.” off:

$$(\sigma_{ab}^2) = \frac{A}{A - 1}\,(\sigma_{ab}^2)' .$$

This, in fact, reflects the relation between the interaction variances (mixed models)
usually found in the literature and those given in the present table, where then $(\sigma_{ab}^2)$
stands for the commonly used variance and $(\sigma_{ab}^2)'$ for the one used in this table. (For the
above-discussed example the definition of $(\sigma_{ab}^2)'$ or simply of $\sigma_{ab}^2$ in this report is
$\sigma_{ab}^2 = E[ab_{\alpha\beta}^2]$. This definition appears to be straightforward if a random sampling
process from an infinite population is involved as is the case here for the levels of factor
$\mathcal{B}$. See also Appendix A.) Comments emphasizing the adequacy of the coefficient
$1/A$ rather than that of $1/(A-1)$ were also made by D. R. Cox and H. E. Daniels in the discussion
of the paper by Plackett (1960).
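
The effect of the two divisors can be seen in a small numerical sketch. It assumes, as an illustration only, that both finite-population definitions sum the squared interaction effects over all levels and differ solely in dividing by (A* − 1)(B* − 1) or by A*(B* − 1); the function name and the effect values are hypothetical.

```python
from fractions import Fraction


def interaction_variances(ab):
    """Return the interaction variance under both divisors.

    ab[alpha][beta] holds hypothetical true interaction effects for a finite
    population of A* x B* factor levels.
    """
    a_star, b_star = len(ab), len(ab[0])
    sum_of_squares = sum(Fraction(effect) ** 2 for row in ab for effect in row)
    common = sum_of_squares / ((a_star - 1) * (b_star - 1))   # divisor (A*-1)(B*-1)
    alternative = sum_of_squares / (a_star * (b_star - 1))    # divisor A*(B*-1)
    return common, alternative


# Made-up interaction effects whose rows and columns each sum to zero.
effects = [[1, -1, 0], [-1, 1, 0]]
common, alternative = interaction_variances(effects)
print(common / alternative)   # equals A*/(A* - 1), here 2
```

Whatever the effect values are, the ratio of the two quantities is A*/(A* − 1), which is the multiplier that appears in the mixed model formulas when A = A*.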


So far, except for the coefficients $k''_a$, $k''_b$, ..., $k'''_a$, $k'''_b$, etc., the formulas for the
EMS in the present table are self-explanatory after the notation has been explained in
paragraph II.1. The k-coefficients are introduced only for simplicity of writing. The number
of their apostrophes indicates whether they relate to the two-way, three-way or four-way
classification; their subscript indicates to which factor they belong. Thus, for example,
the k-coefficient relating to factor $\mathcal{B}$ in a three-way classification is defined as

$$k'''_b = \sum_{\beta=1}^{B} \frac{R_{.\beta.}^2}{R_{...}^2} .$$

Here $R_{.\beta.}$ and $R_{...}$ are the numbers of observations at level $\beta$ of factor $\mathcal{B}$ and the total
number of observations in the whole analysis, respectively. These k-coefficients are
explicitly defined in the table for all three classifications.
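
A short sketch may help to get a feeling for the size of such a coefficient. It assumes that the k-coefficient of a factor is the sum over its levels of the squared proportion of observations at each level, which is the form suggested by the quantities named in the text; the authoritative definitions are those listed in the table itself. The function name is hypothetical.

```python
from fractions import Fraction


def k_coefficient(level_counts):
    """Sum of squared level proportions (assumed form of the k-coefficient)."""
    total = sum(level_counts)
    return sum(Fraction(count, total) ** 2 for count in level_counts)


print(k_coefficient([3, 6]))         # unequal numbers per level
print(k_coefficient([4, 4, 4, 4]))   # equal numbers per level
```

Under this assumption, equal numbers of observations at all levels of a factor give the reciprocal of its number of levels, and any inequality makes the coefficient larger.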


The EMS for a two-way classification with proportional cell numbers when both
factors are random (case [2.$\mathcal{AB}$.$\alpha\beta$] in the present table) seem to have first been given
by H. Fairfield Smith (1951). The expectations of mean squares of H. F. Smith take on
exactly the form of those presented in this table if one replaces his terms by the symbols
of the present table as follows:

a by $R_{\alpha.}/\sqrt{R_{..}}$

b by $R_{.\beta}/\sqrt{R_{..}}$

$S'aa' = (Sa)^2 - Sa^2$ by $R_{..}(1 - k''_a)$

$S'bb'$ by $R_{..}(1 - k''_b)$

N by $R_{..}$

p by A and $V_a$ by $\sigma_a^2$

q by B and $V_b$ by $\sigma_b^2$

$V_{ab}$ by $\sigma_{ab}^2$ and $V_0$ by $\sigma_r^2$.

A final remark concerns the correctness of the formulas presented. After the mean
square expectation formulas had been derived they were checked by application of the
rules for obtaining EMS’s as given in paragraph II.5 below. Complete conformity was
observed in all cases.


II. 4. Approximate F-tests and the estimation of variance components

The appropriate variance ratios for testing all nullhypotheses are listed in the
present table in order to save the user the burden of deriving them, especially in the
complex cases, as mentioned earlier. This listing of variance ratios may, however, also
be of help for the planning of experiments or sample surveys in that it shows under which
conditions exact F-tests will be available and under which they will not. This also is
the main reason why the F- and F'-values are listed for the "partial proportional" cases
(where the EMS are not explicitly given). The reader may thus see at a glance, in these
cases also, which tests of nullhypotheses are or will be available to him.

The numerator and denominator quantities and their effective degrees of freedom $f_1$
and $f_2$, respectively, for the approximate F'-tests are given following Cochran (1951) and,
for the degrees of freedom, Satterthwaite (1946). These two quantities, which are actually
linear combinations of mean squares from the analysis of variance, are denoted (for testing
a nullhypothesis $\sigma_a^2 = 0$, say) by $MS_1(\mathcal{A})$ and $MS_2(\mathcal{A})$, in
analogy to the numerator mean square, $MS(\mathcal{A})$, in the ordinary case. The
expectations of the two quantities are equal if the nullhypothesis is true. Thus, in the
above example, $E[MS_1(\mathcal{A})] = E[MS_2(\mathcal{A})]$ if $\sigma_a^2 = 0$ is true.
The subscripts 1 and 2 always refer to the numerator and the denominator in F',
respectively.
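
The effective degrees of freedom of such a linear combination of mean squares can be sketched as follows. This is only an illustration of the usual Satterthwaite approximation, $f = (\sum c_i MS_i)^2 / \sum (c_i MS_i)^2 / f_i$; the function names and the numerical values are hypothetical and are not taken from the table.

```python
def satterthwaite_df(coefs, mean_squares, dfs):
    """Effective degrees of freedom of sum(c_i * MS_i).

    coefs        -- coefficients c_i (all positive, following Cochran's suggestion)
    mean_squares -- observed mean squares MS_i
    dfs          -- degrees of freedom f_i of the individual mean squares
    """
    terms = [c * ms for c, ms in zip(coefs, mean_squares)]
    total = sum(terms)
    return total ** 2 / sum(t ** 2 / f for t, f in zip(terms, dfs))


def approximate_f(numerator, denominator):
    """Return (F', f1, f2); each argument is a (coefs, mean_squares, dfs) triple."""
    ms1 = sum(c * ms for c, ms in zip(numerator[0], numerator[1]))
    ms2 = sum(c * ms for c, ms in zip(denominator[0], denominator[1]))
    return ms1 / ms2, satterthwaite_df(*numerator), satterthwaite_df(*denominator)


# Illustrative values only: two mean squares in the numerator, two in the denominator.
f_prime, f1, f2 = approximate_f(
    ([1, 1], [10.0, 2.0], [2, 8]),
    ([1, 1], [3.0, 4.0], [4, 6]),
)
```

A single mean square reproduces its own degrees of freedom, which is a convenient sanity check on the formula.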

Cochran proposes to have all coefficients positive in both the numerator and the
denominator linear combination of mean squares because such linear forms are better
represented by a Type III approximation than those where some coefficients are negative.
The two quantities have always been constructed according to this suggestion. In fact,
the coefficients as given in the table for the cases of proportional cell numbers (where
the absolute value of the coefficients is not simply unity) will never be negative. This
is true because the k-coefficients, as can easily be seen, are always smaller than one, and
products like $A k_a''$, $B k_b''$, etc., as appearing in the coefficients for the residual
mean square, $MS(\mathcal{R})$, are always greater than or equal to one. (The last statement
can easily be proven geometrically.) The case in which these products are equal to one
(whereby the residual mean square drops out of the corresponding linear combination because
of $A k_a'' - 1 = 0$, for example) arises when the numbers of observations at the levels of
the corresponding factor become equal. More explicitly, for the above example of a two-way
classification:


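\[ R_{\alpha .} = \text{const.} \quad \text{will make} \quad
   k_a'' = \frac{\sum_\alpha R_{\alpha .}^2}{R_{..}^2} = \frac{1}{A}
   \quad \text{and thereby} \quad A k_a'' - 1 = 0 . \]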
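
The behaviour of the k-coefficient can be checked numerically. The sketch below assumes the definition $k = \sum R_i^2 / (\sum R_i)^2$ for the marginal numbers $R_i$ of one factor; the function name and the example numbers are hypothetical.

```python
from fractions import Fraction


def k_coefficient(marginal_counts):
    """k = (sum of squared marginal numbers) / (total number)^2, as an exact fraction."""
    total = sum(marginal_counts)
    return Fraction(sum(n * n for n in marginal_counts), total * total)


A = 3                                   # number of levels of the factor
k_unequal = k_coefficient([2, 4, 6])    # unequal marginal numbers
k_equal = k_coefficient([4, 4, 4])      # equal marginal numbers

# A * k - 1 is positive for unequal numbers and vanishes for equal numbers.
excess_unequal = A * k_unequal - 1
excess_equal = A * k_equal - 1
```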


The numerator quantity in F' is always such that its expectation is larger than
that of the denominator quantity if the alternative hypothesis is true.

In his above-mentioned paper Cochran discusses the accuracy of the F'-approximation
only for the case when 3 variances (mean squares) are involved in testing a nullhypothesis.
This accuracy is rather good as shown in his Table III ("True significance
probability of F' at the apparent 5% level"). However, there can be no doubt that the
accuracy decreases considerably if many mean squares are involved in F' as is the case
in some F'-values in the four-way classification, for example. Therefore, it is recommended
that the F'-values in this table which involve more than three mean squares be used
with caution. There will be no sense in keeping exactly to the 5% significance
level, say, of the F-distribution. Only a definitely significant F'-value or a definitely
non-significant F'-value will lead to a statement whether the nullhypothesis should be
rejected or not rejected, with a no-decision region of $0.01 \le \alpha \le 0.10$, say.

The approximate degrees of freedom $f_1$ and $f_2$ of F' will always have to be rounded
to the nearest integer in order to compare the computed F'-value with the tabled percentage
point of F.

Another type of approximation in testing nullhypotheses by variance ratios in the
analysis of variance is present if the underlying model is of the mixed effects type. In
this case all those variance ratios which test for fixed main effects and fixed or mixed
interaction effects and whose denominator quantities have expectations larger than
$\sigma_r^2$, the error variance, are not distributed as F under the stated nullhypothesis.
This and the fact that the nullhypothesis on the fixed factor effects in a two-way
classification with underlying mixed model can be tested by Hotelling's $T^2$ have been
discussed by Scheffé (1956a). This author, however, doubts whether the exact test procedure
is worth the extra computational labor. Later, Scheffé (1959) almost ruled out the
application of Hotelling's $T^2$ for cases of mixed models in three- and more-way
classifications in saying that in these cases its use is "numerically so complicated that
it is unlikely ever to be applied in practice." Following Scheffé, however, Imhof (1960)
has given exact test procedures for the three-way classification with one factor fixed and
the other two random.

Up to now it seems to be unknown how good the approximation of the "classical"
F-values is in these situations. In the present table they are marked by asterisks ("*").
The user of this table should keep in mind the approximate character of these F-values
when applying them to test a stated nullhypothesis. This especially will apply when
both the prime and the asterisk are attached to an F-symbol, thus indicating "double"
approximation.




The estimation of variance components is easily achieved with the help of the listed
F- and F'-values. As a general rule the following can be stated:

\[ \begin{bmatrix} \text{Estimate of} \\ \text{variance component} \end{bmatrix}
   = \frac{\begin{pmatrix} \text{Numerator quantity of the F- or F'-value for} \\
                           \text{testing the corresponding nullhypothesis} \end{pmatrix}
           - \begin{pmatrix} \text{Denominator quantity of} \\ \text{this F- or F'-value} \end{pmatrix}}
          {\text{Coefficient of variance component to be estimated in numerator quantity}} \]

Thus, for example, in case .a^y], one has as the best estimate of $\sigma_a^2$:

\[ \hat{\sigma}_a^2 = \frac{MS_1(\mathcal{A}) - MS_2(\mathcal{A})}{R_{...}(1 - k_a''')/(A-1)} . \]

For  the  sampling  variances  of  variance  component  estimates  the  reader  is  referred  to  the 
papers  by  Crump  (1951),  Tukey  (1956),  Welch  (1956)  and  Searle  (1958). 
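
As a minimal sketch of the general rule above, the estimate is the difference of the numerator and denominator quantities divided by the coefficient of the component. The function names and all numbers below are hypothetical; the coefficient follows the three-way example, $R_{...}(1 - k_a''')/(A-1)$.

```python
def main_effect_coefficient(total, k, levels):
    """Coefficient R(1 - k)/(levels - 1) of a main-effect variance component."""
    return total * (1 - k) / (levels - 1)


def variance_component_estimate(numerator, denominator, coefficient):
    """(numerator quantity - denominator quantity) / coefficient of the component."""
    return (numerator - denominator) / coefficient


# Illustrative values: R... = 36 observations, k = 7/18, A = 3 levels.
coef = main_effect_coefficient(36, 7 / 18, 3)
estimate = variance_component_estimate(30.0, 8.0, coef)
```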


II.  5.  Generalization  for  the  n-way  crossed  classification 

Mean square expectations for the n-way crossed classification can be induced from
those given in this table for the two-way, three-way and four-way classification. The rules
for doing this are given below. They could be separately listed for all models and for
proportional as well as for equal cell numbers. For the sake of simplicity, however, the
rules are given for the case of "proportional cell numbers, all factors random" only, with
additional rules of how to change these formulas if one, two, or more factors are fixed.
Any case of equal numbers can then easily be obtained by equating the k-coefficients to
the reciprocals of the numbers of levels of the corresponding factors. (For example, in
the two-way classification, if there are equal numbers of observations at the levels of
factor $\mathcal{A}$, one substitutes $k_a'' = 1/A$.)

1. Rules for obtaining expectations of mean squares in an n-way crossed
classification, case of unequal but proportional cell numbers, all factors random.

1a. Each mean square expectation contains the residual variance component $\sigma_r^2$
(with coefficient 1); further it contains all variance components (with coefficients as
described in 1b. below) with subscript combinations containing those small Latin letters
which correspond to the script letters in the designating symbol of the respective mean
square. Thus, for example, the expectation of the mean square for the first order
interaction $\mathcal{A}\mathcal{B}$ in the five-way classification will contain $\sigma_r^2$
plus the following components (with coefficients as indicated in 1b. below):
$\sigma_{abcde}^2$, $\sigma_{abcd}^2$, $\sigma_{abce}^2$, $\sigma_{abde}^2$, $\sigma_{abc}^2$,
$\sigma_{abd}^2$, $\sigma_{abe}^2$ and $\sigma_{ab}^2$.

In general, in an n-way classification with all factors random, the expectation of a
mean square designated by, say, f script letters thus will contain $\sigma_r^2$ plus
$2^{n-f}$ additional variance components related to main effects and interactions. Main
effects, therefore, are characterized by f = 1 and their expectations will contain
$\sigma_r^2$ plus $2^{n-1}$ additional variance components.

1b. The coefficients of the $2^{n-f}$ variance components are obtained as follows:
Each component is multiplied by the total number of observations in the n-way
classification. (In a five-way classification, say, $R_{.....}$ is the symbol for this
number.) For each Latin letter in the subscript of the variance component which corresponds
to a designating script letter in the respective mean square, there will be a coefficient
$(1-k)$ divided by the number of degrees of freedom corresponding to that particular script
letter. Thus, in the above example, all 8 variance components (not including $\sigma_r^2$)
in $E[MS(\mathcal{A}\mathcal{B})]$ will have

\[ \frac{(1-k_a)(1-k_b)}{(A-1)(B-1)} \]

as a common coefficient. Corresponding to all other Latin letters in the subscript the
variance component will be multiplied by a k-coefficient with that very subscript. Thus,
finally, in the above example, the coefficient of $\sigma_{abcde}^2$ in
$E[MS(\mathcal{A}\mathcal{B})]$, say, will be

\[ R_{.....} \, \frac{(1-k_a)(1-k_b)\, k_c k_d k_e}{(A-1)(B-1)} . \]

2. Rules for deriving mean square expectations from the formulas obtained under
1a. and 1b. above for the case of unequal but proportional cell numbers, one factor or more
fixed, the others random.

2a. If a factor is fixed all variance components containing the corresponding
small Latin letter in their subscripts are deleted in the expectations of those mean squares
among whose designating script letters is not the script letter of the fixed factor. Thus,
in the before-mentioned example of a five-way classification, if factor $\mathcal{C}$ is
fixed, say, the components $\sigma_{abcde}^2$, $\sigma_{abcd}^2$, $\sigma_{abce}^2$ and
$\sigma_{abc}^2$ are deleted in $E[MS(\mathcal{A}\mathcal{B})]$. They are, however, not
deleted in the expectation of $MS(\mathcal{A}\mathcal{C})$, for example.

2b. In the expectations of those mean squares among whose designating script
letters is that of the fixed factor, the coefficient $(1-k)$ corresponding to the fixed
factor is deleted. In the above example of a five-way classification with factor
$\mathcal{C}$ fixed, $(1-k_c)$ is thus deleted in the common multiplier of all 8 variance
components (not including $\sigma_r^2$) in $E[MS(\mathcal{A}\mathcal{C})]$.

2c. In the expectation of the mean square due to the fixed factor the term
containing the corresponding variance component is replaced by the weighted sum of the
squared (true) level effects divided by the corresponding degrees of freedom. Thus, in
the example, if again factor $\mathcal{C}$ is assumed to be fixed,
$\frac{R_{.....}}{C-1}\,\sigma_c^2$ in $E[MS(\mathcal{C})]$ (after application of rule 2b.)
is replaced by

\[ \frac{1}{C-1} \sum_\gamma R_{..\gamma..}\, c_\gamma^2 . \]

2d. If two or more factors are fixed, rules 2a. - 2c. simultaneously apply with
respect to all fixed factors. Moreover, the terms including interaction variance components
due to fixed factors only are replaced by the corresponding weighted sums of squared
(true) interaction effects, divided by the respective degrees of freedom. Thus in the
five-way classification with factors $\mathcal{A}$ and $\mathcal{B}$ fixed, say, the term
$\frac{R_{.....}}{(A-1)(B-1)}\,\sigma_{ab}^2$ in $E[MS(\mathcal{A}\mathcal{B})]$ (after
application of rule 2b.) is replaced by

\[ \frac{1}{(A-1)(B-1)} \sum_\alpha \sum_\beta R_{\alpha\beta...}\, ab_{\alpha\beta}^2 . \]
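
The deletion rule 2a. can be sketched as a filter on the subscript sets of rule 1a. The function name and the letter labels are hypothetical; the sketch covers rule 2a. only and does not replace components by sums of squared effects (rules 2c. and 2d.).

```python
from itertools import combinations


def surviving_components(term, factors, fixed):
    """Subscripts of the variance components left in E[MS(term)] after rule 2a.

    A component is deleted when its subscript contains the letter of a fixed
    factor that is not among the designating letters of the mean square.
    """
    others = [f for f in factors if f not in term]
    kept = []
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            if not any(f in fixed for f in extra):
                kept.append("".join(sorted(term + "".join(extra))))
    return kept


# Five-way classification with the third factor (letter "c") fixed.
kept_ab = surviving_components("ab", "abcde", fixed={"c"})
kept_ac = surviving_components("ac", "abcde", fixed={"c"})
```

For the mean square designated by "ab" the four components containing "c" are removed, while for "ac" all eight components remain, in agreement with the example in rule 2a.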


II.  6.  The  case  of  orthogonal  contrasts  in  fixed  factors. 

In case of orthogonal contrasts in fixed factors, i.e., when the (overall) sum of
squares for main effects of a fixed factor or for interaction effects involving one or more
fixed factors is split into independent components with one degree of freedom each, the
situation with respect to expected values and testing of nullhypotheses is not, in general,
a priori obvious.

However, if one deals with a fixed model it is not difficult to see that such a single
degree of freedom component will have expectation $\sigma_r^2$ if the corresponding
nullhypothesis is true. This applies both for unequal but proportional and for equal cell
numbers. Therefore, in this case of a fixed model, one will have as denominator quantity in
the respective variance ratio the mean square for replications or the mean square for the
highest order interaction (if the latter can be assumed not existent), respectively,
depending upon whether or not one has replicated observations in the cells.

In the case of a mixed model with unequal but proportional cell numbers the
expectations of the said orthogonal components are such that the nullhypotheses concerned
cannot be tested following the pattern given for the (overall) mean squares in this table.
The expectations of the components and adequate procedures for testing the corresponding
nullhypotheses will be given in a later report.

The situation with respect to orthogonal contrasts simplifies to a certain extent
when in case of mixed models the cell numbers are equal. This also will be discussed
in the above-mentioned later report.

F*, F'*: Prime and/or asterisk attached to F-value:
Variance ratio only approximately distributed as F under stated nullhypothesis $H_0$. See paragraph II.4.

Description of the method used for the derivations of mean square expectations.

1.  Definition  of  a  generalized  expectation. 

In deriving the mean square expectations as they are listed in this report use was
made of a generalized expectation which will be defined below. This expectation operation
is applied to the "cell" responses which are "true" in the sense that they constitute
one of the two additive components in the assumed underlying model for the analysis of
variance. (The second component is the residual or error term from a normal population
with expectation zero and variance $\sigma_r^2$.)

The generalized expectation (as well as the entire method) will be explained with
the example of the two-way crossed classification. Here, one has the model

\[ x_{\alpha\beta\rho} = X_{\alpha\beta} + r_{\alpha\beta\rho} \]

where $x_{\alpha\beta\rho}$ is the $\rho$th observation in cell "$\alpha\beta$", or: the
$\rho$th observation at level $\alpha$ of factor $\mathcal{A}$ and at level $\beta$ of factor
$\mathcal{B}$, with $\alpha = 1, \ldots, A$, $\beta = 1, \ldots, B$ and
$\rho = 1, \ldots, R_{\alpha\beta}$. $R_{\alpha\beta}$ is the number of replicated
observations in cell "$\alpha\beta$" and is assumed to meet the proportionality condition
$R_{\alpha\beta} = R_{\alpha .} R_{. \beta} / R_{..}$ (see Appendix B).

$r_{\alpha\beta\rho}$ is the above-mentioned error term with the assumption
$r_{\alpha\beta\rho} \sim NID(0, \sigma_r^2)$.

Finally, $X_{\alpha\beta}$ denotes the "true" response of the random variable under
consideration in cell "$\alpha\beta$", no matter whether random or fixed effects are
represented by factors $\mathcal{A}$ and $\mathcal{B}$. In the following, $X_{\alpha\beta}$
and its generalized expectation are defined for the three cases: 1a. $\mathcal{A}$ and
$\mathcal{B}$ random, 1b. $\mathcal{A}$ and $\mathcal{B}$ fixed, 1c. $\mathcal{A}$ fixed,
$\mathcal{B}$ random.


1a. Case of both factors random

Here $X_{\alpha\beta}$ is defined as follows:

\[ X_{\alpha\beta} = f(x, y \mid x = x_\alpha,\ y = y_\beta) , \]

where $f(x,y)$ is a unique but unknown function of the two "carrier variables" x and y.
These carrier variables are continuous but otherwise unspecified random variables,
relating to the $\mathcal{A}$- and $\mathcal{B}$-classification, respectively, with joint
probability density function $p(x,y)$. Thus, level $\alpha$ of factor $\mathcal{A}$ is
uniquely defined by the sampled value $x = x_\alpha$, and level $\beta$ of factor
$\mathcal{B}$ correspondingly by the sampled value $y = y_\beta$. Now, using the notation
$X_{\alpha\beta} = f(x, y_\beta)$, the expectation of $X_{\alpha\beta}$ with respect to the
carrier variable x for any given $y = y_\beta$ is defined as:

\[ E[f(x, y \mid y = y_\beta)]
   = \int_{-\infty}^{+\infty} f(x, y \mid y = y_\beta)\, p(x, y \mid y = y_\beta)\, dx . \]

The left hand side will be abbreviated to $E_\alpha[X_{\alpha\beta}]$ because the above
operation is analogous to an averaging over subscript $\alpha$ for any given $y = y_\beta$.
If further the marginal probability density function of y,

\[ p_\beta(y) = \int_{-\infty}^{+\infty} p(x,y)\, dx , \]

is introduced, one gets

\[ E_\alpha[X_{\alpha\beta}]
   = \int_{-\infty}^{+\infty} f(x, y_\beta)\, \frac{p(x, y_\beta)}{p_\beta(y_\beta)}\, dx . \]

Because this will hold for any sampled value $y = y_\beta$, the definition can finally be
written

\[ E_\alpha[X_{\alpha\beta}]
   = \int_{-\infty}^{+\infty} f(x,y)\, \frac{p(x,y)}{p_\beta(y)}\, dx . \]

Correspondingly, one has

\[ E_\beta[X_{\alpha\beta}]
   = \int_{-\infty}^{+\infty} f(x,y)\, \frac{p(x,y)}{p_\alpha(x)}\, dy . \]

The expectation of $X_{\alpha\beta}$ with respect to both carrier variables x and y is
defined by attaching both subscripts $\alpha$ and $\beta$ to the expectation symbol:

\[ E_{\alpha\beta}[X_{\alpha\beta}]
   = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(x,y)\, p(x,y)\, dx\, dy . \]

In the derivations of expected mean squares repeated use must be made of the
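following relation between the three above-defined expectations:

\[ E_\alpha\bigl[E_\beta[X_{\alpha\beta}]\bigr]
   = E_\beta\bigl[E_\alpha[X_{\alpha\beta}]\bigr]
   = E_{\alpha\beta}[X_{\alpha\beta}] . \]

This property holds if the product $f(x,y)\,p(x,y)$ is continuous for
$(-\infty < x < +\infty,\ -\infty < y < +\infty)$.
This can readily be assumed, so that the relation will be generally valid.

Proof. One has

\[ E_\alpha\bigl[E_\beta[X_{\alpha\beta}]\bigr] = E_\alpha[\varphi(x)] , \]

where the random variable

\[ \varphi(x) = E_\beta[X_{\alpha\beta}]
   = \int_{-\infty}^{+\infty} f(x,y)\, \frac{p(x,y)}{p_\alpha(x)}\, dy \]

is a function of x alone, and x has probability density

\[ p_\alpha(x) = \int_{-\infty}^{+\infty} p(x,y)\, dy . \]

The expectation of $\varphi(x)$, being that with respect to the carrier variable x, will be
expressed according to the present notation by attaching the subscript $\alpha$ to the
expectation symbol:

\[ E_\alpha[\varphi(x)]
   = \int_{-\infty}^{+\infty} \varphi(x)\, p_\alpha(x)\, dx
   = \int_{-\infty}^{+\infty} \left[ \int_{-\infty}^{+\infty}
       f(x,y)\, \frac{p(x,y)}{p_\alpha(x)}\, dy \right] p_\alpha(x)\, dx
   = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(x,y)\, p(x,y)\, dx\, dy , \]

because the sequence of integration can be interchanged if the integrand is everywhere
continuous. The last term by definition is equal to $E_{\alpha\beta}[X_{\alpha\beta}]$, so
that

\[ E_\alpha\bigl[E_\beta[X_{\alpha\beta}]\bigr] = E_{\alpha\beta}[X_{\alpha\beta}] . \]

Correspondingly it can be shown that

\[ E_\beta\bigl[E_\alpha[X_{\alpha\beta}]\bigr] = E_{\alpha\beta}[X_{\alpha\beta}] , \]

q. e. d.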
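
The relation just proven can be illustrated numerically on a grid, where the integrals become finite sums. Everything in this sketch is an assumption made for illustration: the response function `f`, the unnormalised joint density `weight`, and the grid itself.

```python
import math

# Grid of carrier-variable values standing in for the continuous ranges of x and y.
xs = [-3 + 0.25 * i for i in range(25)]
ys = [-3 + 0.25 * j for j in range(25)]


def f(x, y):
    """Hypothetical 'true' response as a function of the carrier variables."""
    return x * y + x + 2.0


def weight(x, y):
    """Hypothetical unnormalised joint density with dependence between x and y."""
    return math.exp(-(x * x - x * y + y * y))


norm = sum(weight(x, y) for x in xs for y in ys)
p = {(x, y): weight(x, y) / norm for x in xs for y in ys}   # joint probabilities
p_alpha = {x: sum(p[x, y] for y in ys) for x in xs}         # marginal of x


def e_beta(x):
    """Expectation with respect to y for given x (conditional weights p / p_alpha)."""
    return sum(f(x, y) * p[x, y] / p_alpha[x] for y in ys)


# Expectation with respect to both carrier variables, computed in two ways.
e_alpha_beta = sum(f(x, y) * p[x, y] for x in xs for y in ys)
iterated = sum(e_beta(x) * p_alpha[x] for x in xs)
```

Both values agree up to rounding, which mirrors the interchange of the order of integration used in the proof.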


1b. Case of both factors fixed

Here the definition of $X_{\alpha\beta}$ is:

\[ X_{\alpha\beta} = g(x_\alpha, y_\beta) \]

with $x_\alpha$ and $y_\beta$ being discrete but otherwise unspecified carrier variables
(non-random) which take on the values and only the values
$x_1, \ldots, x_\alpha, \ldots, x_A$ and $y_1, \ldots, y_\beta, \ldots, y_B$, respectively.
$g(x_\alpha, y_\beta)$ is another unique but unknown function of the carrier variables.
Because these are non-random, there will be no possibility of taking the usual expectations
of $X_{\alpha\beta}$ in this case. By analogy to the usual expectations and for later use
in the model, however, the weighted averages of $X_{\alpha\beta}$ are formed, making use of
the proportionality condition $R_{\alpha\beta} = R_{\alpha .} R_{. \beta} / R_{..}$:

\[ \frac{\sum_\alpha R_{\alpha\beta} X_{\alpha\beta}}{\sum_\alpha R_{\alpha\beta}}
   = \frac{\sum_\alpha R_{\alpha .}\, g(x_\alpha, y_\beta)}{R_{..}} , \]

\[ \frac{\sum_\beta R_{\alpha\beta} X_{\alpha\beta}}{\sum_\beta R_{\alpha\beta}}
   = \frac{\sum_\beta R_{. \beta}\, X_{\alpha\beta}}{R_{..}} , \]

\[ \frac{\sum_\alpha \sum_\beta R_{\alpha\beta} X_{\alpha\beta}}{R_{..}} . \]

1c. Case of one factor fixed ($\mathcal{A}$, say), the other ($\mathcal{B}$) random. (Mixed model).
As a combination of cases 1a and 1b, here $X_{\alpha\beta}$ is defined as:

\[ X_{\alpha\beta} = h(x_\alpha, y \mid y = y_\beta) , \]

where $y = y_\beta$ is a sampled value of the continuous random variable y and where
$x_\alpha$ takes on the values and only the values $x_1, \ldots, x_\alpha, \ldots, x_A$.
y has probability density function $p(y)$, say. $h(x_\alpha, y)$ is a third unique but
unknown function of the carrier variables. Then the expectation with respect to y is
defined to be

\[ E_\beta[X_{\alpha\beta}] = \int_{-\infty}^{+\infty} h(x_\alpha, y)\, p(y)\, dy . \]

With respect to the non-random variable $x_\alpha$, the weighted average of
$X_{\alpha\beta}$ is formed for later use in the model:

\[ \frac{\sum_\alpha R_{\alpha .}\, h(x_\alpha, y)}{R_{..}} . \]

By analogy of taking expectations with respect to both carrier variables the following
term is formed for later use:

\[ \int_{-\infty}^{+\infty} \frac{\sum_\alpha R_{\alpha .}\, h(x_\alpha, y)}{R_{..}}\, p(y)\, dy
   = \frac{\sum_\alpha R_{\alpha .}\, E_\beta[X_{\alpha\beta}]}{R_{..}} . \]

The expectations and their analogues as introduced above for the three
models will be used for the definition of the model components (section 2, below).

In deriving the properties of the model components expectations will have to
be taken also of the error term $r_{\alpha\beta\rho}$. In accordance with the notation used
before, in these cases the subscript $\rho$ will be attached to the expectation symbol. It
is obvious that, for example,

\[ E_{\beta\rho}[r_{\alpha\beta\rho}] = E_\beta\bigl[E_\rho[r_{\alpha\beta\rho}]\bigr] . \]

In this connection it may also be mentioned that the expectation "with respect
to $\rho$" applied to the actual observation $x_{\alpha\beta\rho}$ as expressed by the basic
model, $x_{\alpha\beta\rho} = X_{\alpha\beta} + r_{\alpha\beta\rho}$, leads to the "true"
cell response:

\[ E_\rho[x_{\alpha\beta\rho}] = X_{\alpha\beta} . \]

Finally, the definition $E_\rho[r_{\alpha\beta\rho}^2] = \sigma_r^2$ is used for all three
models.


2. Definitions and properties of model components.

In the following, the definition and properties of the model components are
exemplified with the two-way classification for the random, fixed and mixed effects case.
The generalization for three- and more-way classifications can easily be made.

Generally (for all three models in the two-way classification), the true cell response
$X_{\alpha\beta}$ will be split up into a general mean $\mu$, plus the effect $a_\alpha$ of
level $\alpha$ of factor $\mathcal{A}$, plus the effect $b_\beta$ of level $\beta$ of factor
$\mathcal{B}$, plus the interaction effect $ab_{\alpha\beta}$. Therefore, the general
model for the two-way classification will read:

\[ x_{\alpha\beta\rho} = X_{\alpha\beta} + r_{\alpha\beta\rho}
   = \mu + a_\alpha + b_\beta + ab_{\alpha\beta} + r_{\alpha\beta\rho} . \]

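2a. Case of both factors random

Here the following definitions are made:

\[ \mu = E_{\alpha\beta}[X_{\alpha\beta}] \]
\[ a_\alpha = E_\beta[X_{\alpha\beta}] - E_{\alpha\beta}[X_{\alpha\beta}] \]
\[ b_\beta = E_\alpha[X_{\alpha\beta}] - E_{\alpha\beta}[X_{\alpha\beta}] \]
\[ ab_{\alpha\beta} = X_{\alpha\beta} - E_\beta[X_{\alpha\beta}] - E_\alpha[X_{\alpha\beta}]
   + E_{\alpha\beta}[X_{\alpha\beta}] \]

It will be noted that

\[ \mu + a_\alpha + b_\beta + ab_{\alpha\beta} = X_{\alpha\beta} . \]

Also it can be seen, inserting the above definitions of the model terms and making use of
the theorem on the expectation operations as proven in section 1, that: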

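\[ E_\alpha[a_\alpha] = 0 \]
\[ E_\beta[b_\beta] = 0 \]
\[ E_\alpha[ab_{\alpha\beta}] = E_\beta[ab_{\alpha\beta}] = E_{\alpha\beta}[ab_{\alpha\beta}] = 0 \]

In the same way, the following covariances are shown to be zero:

\[ E_{\alpha\beta}[a_\alpha b_\beta] = 0 \]

Correspondingly:

\[ E_{\alpha\beta}[a_\alpha\, ab_{\alpha\beta}] = E_{\alpha\beta}[b_\beta\, ab_{\alpha\beta}] = 0 \]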


Considering the fact that the values $x_\alpha$ and $y_\beta$ of the carrier variables x
and y in

\[ X_{\alpha\beta} = f(x, y \mid x = x_\alpha,\ y = y_\beta) \]

are randomly sampled from continuous distributions, the covariances
$\mathrm{Cov}(x_\alpha, x_{\alpha'})$ and $\mathrm{Cov}(y_\beta, y_{\beta'})$ are zero by
definition. This in turn implies that

\[ E[a_\alpha a_{\alpha'}] = 0 \quad \text{and} \quad E[b_\beta b_{\beta'}] = 0 \]

for $\alpha \ne \alpha'$ and $\beta \ne \beta'$, respectively, and

\[ E[ab_{\alpha\beta}\, ab_{\alpha'\beta'}] = 0 \]

for $\alpha\beta \ne \alpha'\beta'$.

In the last statements the expectation operation is the usual one, i.e., it is clear with
respect to which random variables the expectations will have to be taken. Definitions of
these expectation operations by attaching subscripts to the expectation symbols, analogous
to the definitions introduced before, would lead to further but actually unnecessary
formulas. The usual expectation symbol will also be applied in all further derivations
of this appendix wherever its meaning is obvious.

Finally for this case of both factors random, the following definitions are used for
the expectations of the squared model terms as they appear in the expectations of the
mean squares:

\[ E[a_\alpha^2] = \sigma_a^2 \]
\[ E[b_\beta^2] = \sigma_b^2 \]
\[ E[ab_{\alpha\beta}^2] = \sigma_{ab}^2 \]

For the F-tests in the analysis of variance only, the assumption is made that the
$a_\alpha$, $b_\beta$ and $ab_{\alpha\beta}$ are normally distributed. (That they are also
independently distributed with expectations zero followed from their definitions as shown
above.)

2b.  Case  of  both  factors  fixed 

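For this case the model components are defined as follows:

\[ \mu = \frac{\sum_\alpha \sum_\beta R_{\alpha\beta} X_{\alpha\beta}}{R_{..}} \]

\[ a_\alpha = \frac{\sum_\beta R_{. \beta} X_{\alpha\beta}}{R_{..}}
   - \frac{\sum_\alpha \sum_\beta R_{\alpha\beta} X_{\alpha\beta}}{R_{..}} \]

\[ b_\beta = \frac{\sum_\alpha R_{\alpha .} X_{\alpha\beta}}{R_{..}}
   - \frac{\sum_\alpha \sum_\beta R_{\alpha\beta} X_{\alpha\beta}}{R_{..}} \]

\[ ab_{\alpha\beta} = X_{\alpha\beta}
   - \frac{\sum_\beta R_{. \beta} X_{\alpha\beta}}{R_{..}}
   - \frac{\sum_\alpha R_{\alpha .} X_{\alpha\beta}}{R_{..}}
   + \frac{\sum_\alpha \sum_\beta R_{\alpha\beta} X_{\alpha\beta}}{R_{..}} \]

Again, one has

\[ \mu + a_\alpha + b_\beta + ab_{\alpha\beta} = X_{\alpha\beta} . \]

As can readily be verified, the following relations hold:

\[ \sum_\alpha R_{\alpha .}\, a_\alpha = 0 \]
\[ \sum_\beta R_{. \beta}\, b_\beta = 0 \]
\[ \sum_\alpha R_{\alpha .}\, ab_{\alpha\beta} = \sum_\beta R_{. \beta}\, ab_{\alpha\beta} = 0 . \]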
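
The definitions and relations for the fixed model can be verified on a small example. The marginal numbers and the "true" cell responses below are hypothetical; exact fractions are used so that the weighted sums vanish exactly.

```python
from fractions import Fraction

row = [6, 12]        # marginal numbers R_alpha. (hypothetical)
col = [3, 6, 9]      # marginal numbers R_.beta  (hypothetical)
total = sum(row)     # R.. (equals sum(col))
# Proportional cell numbers R_alpha,beta = R_alpha. * R_.beta / R..
cells = [[Fraction(r * c, total) for c in col] for r in row]
X = [[5, 7, 4], [6, 3, 9]]   # hypothetical true cell responses

A, B = len(row), len(col)
mu = sum(cells[i][j] * X[i][j] for i in range(A) for j in range(B)) / total
a = [sum(Fraction(col[j] * X[i][j], total) for j in range(B)) - mu for i in range(A)]
b = [sum(Fraction(row[i] * X[i][j], total) for i in range(A)) - mu for j in range(B)]
ab = [[X[i][j] - mu - a[i] - b[j] for j in range(B)] for i in range(A)]

# Weighted sums that should vanish according to the relations above.
sum_a = sum(row[i] * a[i] for i in range(A))
sum_b = sum(col[j] * b[j] for j in range(B))
sum_ab_over_alpha = [sum(row[i] * ab[i][j] for i in range(A)) for j in range(B)]
sum_ab_over_beta = [sum(col[j] * ab[i][j] for j in range(B)) for i in range(A)]
```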

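2c. Case of one factor ($\mathcal{A}$) fixed, the other ($\mathcal{B}$) random. (Mixed model).

Making proper use of both types of definitions one has for this case:

\[ \mu = \frac{\sum_\alpha R_{\alpha .}\, E_\beta[X_{\alpha\beta}]}{R_{..}} \]

\[ a_\alpha = E_\beta[X_{\alpha\beta}]
   - \frac{\sum_\alpha R_{\alpha .}\, E_\beta[X_{\alpha\beta}]}{R_{..}} \]

\[ b_\beta = \frac{\sum_\alpha R_{\alpha .}\, X_{\alpha\beta}}{R_{..}}
   - \frac{\sum_\alpha R_{\alpha .}\, E_\beta[X_{\alpha\beta}]}{R_{..}} \]

\[ ab_{\alpha\beta} = X_{\alpha\beta} - E_\beta[X_{\alpha\beta}]
   - \frac{\sum_\alpha R_{\alpha .}\, X_{\alpha\beta}}{R_{..}}
   + \frac{\sum_\alpha R_{\alpha .}\, E_\beta[X_{\alpha\beta}]}{R_{..}} \]

As before,

\[ \mu + a_\alpha + b_\beta + ab_{\alpha\beta} = X_{\alpha\beta} . \]

It will also be noticed that $a_\alpha$ is no random variable, whereas $b_\beta$ is one and
$ab_{\alpha\beta}$ is one for any given $\alpha$.

Here one has the following relations:

\[ \sum_\alpha R_{\alpha .}\, a_\alpha = 0 \]
\[ \sum_\alpha R_{\alpha .}\, ab_{\alpha\beta} = 0 \]
\[ E_\beta[b_\beta] = 0 \]
\[ E_\beta[ab_{\alpha\beta}] = 0 \]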


It  is  of  importance  that  ^  0  and  2£  ^a^c.H  ^  further  it  is  worthwhile  to 

mention  that  the  term  £  a^af3j  does  not  appear  in  the  derivations  of  expected  mean 
squares  for  this  case.  Naturally, 


Pp 


Finally, the variances are defined for this case as follows:

\[ E[b_\beta^2] = \sigma_b^2 \]
\[ E[ab_{\alpha\beta}^2] = \sigma_{ab}^2 , \]

the latter under the assumption that $E[ab_{\alpha\beta}^2]$ is constant for all A levels
of the fixed factor $\mathcal{A}$.

For the F-tests only it is further assumed that the $b_\beta$ and the $ab_{\alpha\beta}$
(the latter for all A levels of $\mathcal{A}$) are normally distributed.

For this case of the mixed model only it will be demonstrated below that the
differences between the estimates of any two main effects and the second differences
between estimates of interaction effects are unbiased. (The same property can be shown
to be valid for the estimates in the other two models and, equally, for those in three-
and more-way classifications.)

As is well known, one gets as least squares estimates:


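\[ \hat{a}_\alpha = \bar{x}_{\alpha ..} - \bar{x}_{...} \]
\[ \hat{b}_\beta = \bar{x}_{. \beta .} - \bar{x}_{...} \]
\[ \widehat{ab}_{\alpha\beta} = \bar{x}_{\alpha\beta .} - \bar{x}_{\alpha ..}
   - \bar{x}_{. \beta .} + \bar{x}_{...} \]

The contrasts which have to be shown to be unbiased are consequently
($\alpha \ne \alpha'$, $\beta \ne \beta'$):

\[ \hat{a}_\alpha - \hat{a}_{\alpha'} = \bar{x}_{\alpha ..} - \bar{x}_{\alpha' ..} \]
\[ \hat{b}_\beta - \hat{b}_{\beta'} = \bar{x}_{. \beta .} - \bar{x}_{. \beta' .} \]
\[ \widehat{ab}_{\alpha\beta} - \widehat{ab}_{\alpha'\beta} - \widehat{ab}_{\alpha\beta'}
   + \widehat{ab}_{\alpha'\beta'}
   = \bar{x}_{\alpha\beta .} - \bar{x}_{\alpha'\beta .} - \bar{x}_{\alpha\beta' .}
   + \bar{x}_{\alpha'\beta' .} \]

Making use of the model $x_{\alpha\beta\rho} = X_{\alpha\beta} + r_{\alpha\beta\rho}$ and of
the fact that for the present mixed model the expectation of the first contrast has to be
taken "with respect to $\beta$ and $\rho$" and the expectations of the other two contrasts
only "with respect to $\rho$", one gets:

\[ E_{\beta\rho}[\hat{a}_\alpha - \hat{a}_{\alpha'}]
   = E_{\beta\rho}\!\left[
       \frac{\sum_\beta \sum_\rho (X_{\alpha\beta} + r_{\alpha\beta\rho})}{R_{\alpha .}}
     - \frac{\sum_\beta \sum_\rho (X_{\alpha'\beta} + r_{\alpha'\beta\rho})}{R_{\alpha' .}}
     \right]
   = a_\alpha - a_{\alpha'} ; \]

\[ E_\rho[\hat{b}_\beta - \hat{b}_{\beta'}]
   = E_\rho\!\left[
       \frac{\sum_\alpha \sum_\rho (X_{\alpha\beta} + r_{\alpha\beta\rho})}{R_{. \beta}}
     - \frac{\sum_\alpha \sum_\rho (X_{\alpha\beta'} + r_{\alpha\beta'\rho})}{R_{. \beta'}}
     \right]
   = \frac{\sum_\alpha R_{\alpha .} X_{\alpha\beta}}{R_{..}}
     - \frac{\sum_\alpha R_{\alpha .} X_{\alpha\beta'}}{R_{..}}
   = b_\beta - b_{\beta'} , \]

and, correspondingly:

\[ E_\rho\bigl[\widehat{ab}_{\alpha\beta} - \widehat{ab}_{\alpha'\beta}
   - \widehat{ab}_{\alpha\beta'} + \widehat{ab}_{\alpha'\beta'}\bigr]
   = X_{\alpha\beta} - X_{\alpha'\beta} - X_{\alpha\beta'} + X_{\alpha'\beta'}
   = ab_{\alpha\beta} - ab_{\alpha'\beta} - ab_{\alpha\beta'} + ab_{\alpha'\beta'} . \]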


3. Derivation of EMS-formulas.

In this section the derivation of the formulas for the mean square expectations will
be exemplified with the case of the mixed effects model, i.e., the case of factor
$\mathcal{A}$ fixed, factor $\mathcal{B}$ random. This example will show all essential
principles of the derivations which have been applied to obtain all expectations contained
in the table of this report.

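First, the average values of the actual observations $x_{\alpha\beta\rho}$ are expressed in
terms of the underlying model,

\[ x_{\alpha\beta\rho} = \mu + a_\alpha + b_\beta + ab_{\alpha\beta} + r_{\alpha\beta\rho} , \]

using the dot-notation for the summation over a subscript:

\[ \bar{x}_{\alpha\beta .} = \mu + a_\alpha + b_\beta + ab_{\alpha\beta}
   + \frac{r_{\alpha\beta .}}{R_{\alpha\beta}} \]

\[ \bar{x}_{\alpha ..} = \mu + a_\alpha
   + \frac{\sum_\beta R_{. \beta}\, b_\beta}{R_{..}}
   + \frac{\sum_\beta R_{. \beta}\, ab_{\alpha\beta}}{R_{..}}
   + \frac{r_{\alpha ..}}{R_{\alpha .}} \]

\[ \bar{x}_{. \beta .} = \mu + b_\beta + \frac{r_{. \beta .}}{R_{. \beta}} \]

\[ \bar{x}_{...} = \mu + \frac{\sum_\beta R_{. \beta}\, b_\beta}{R_{..}}
   + \frac{r_{...}}{R_{..}} \]

(In these expressions use has already been made of the relations
$\sum_\alpha R_{\alpha .}\, a_\alpha = 0$ and $\sum_\alpha R_{\alpha .}\, ab_{\alpha\beta} = 0$
as they were found in paragraph 2c, above.)

Substituting the above expressions for the averages in the mean square terms one gets:

\[ E[MS(\mathcal{A})]
   = E\!\left[ \frac{1}{A-1} \sum_\alpha R_{\alpha .}
       (\bar{x}_{\alpha ..} - \bar{x}_{...})^2 \right]
   = E\!\left[ \frac{1}{A-1} \sum_\alpha R_{\alpha .}
       \left( a_\alpha + \frac{\sum_\beta R_{. \beta}\, ab_{\alpha\beta}}{R_{..}}
       + \frac{r_{\alpha ..}}{R_{\alpha .}} - \frac{r_{...}}{R_{..}} \right)^2 \right] \]

After performing the squaring and summation and in using the definitions and relations
from paragraph 2c, above, one gets:

\[ E[MS(\mathcal{A})]
   = \sigma_r^2 + \frac{\sum_\beta R_{. \beta}^2 / R_{..}}{A-1}\, \sigma_{ab}^2
     + \frac{1}{A-1} \sum_\alpha R_{\alpha .}\, a_\alpha^2
   = \sigma_r^2 + \frac{R_{..}\, k_b''}{A-1}\, \sigma_{ab}^2
     + \frac{1}{A-1} \sum_\alpha R_{\alpha .}\, a_\alpha^2 \]

with $k_b'' = \sum_\beta R_{. \beta}^2 / R_{..}^2$.

The last form is that which is given for case $[2.\mathcal{B}.\alpha\beta]$ in the table of
this report. Further one has:

\[ E[MS(\mathcal{B})]
   = E\!\left[ \frac{1}{B-1} \sum_\beta R_{. \beta}
       (\bar{x}_{. \beta .} - \bar{x}_{...})^2 \right]
   = \sigma_r^2 + \frac{R_{..}(1 - k_b'')}{B-1}\, \sigma_b^2 \]

Correspondingly, the other two expectations are obtained: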

\[ E[MS(\mathcal{A}\mathcal{B})]
   = \sigma_r^2 + \frac{R_{..}(1 - k_b'')}{(A-1)(B-1)}\, \sigma_{ab}^2 \]

\[ E[MS(\mathcal{R})] = \sigma_r^2 \]

APPENDIX B

Proportionality conditions for cell numbers in crossed classifications.

In order to obtain an orthogonal decomposition of the total sum of squares in the
analysis of variance for crossed classifications with unequal numbers of observations in
the cells, all sums of products have to be identically zero. This only is achieved when
the so-called proportionality condition holds for the cell numbers. The condition will be
derived below for the two-way crossed classification; the generalization to three- and
more-way classifications can easily be obtained.

In the notation of this report one has for the decomposition of the total sum of
squares in a two-way classification with unequal cell numbers $R_{\alpha\beta}$:

\[ \sum_\alpha \sum_\beta \sum_\rho (x_{\alpha\beta\rho} - \bar{x}_{...})^2
   = (\text{orthogonal sums of squares}) + (\text{sums of products}) . \]

In the second part on the right hand side it is sufficient to consider the sum

\[ 2 \sum_\alpha \sum_\beta R_{\alpha\beta}
   (\bar{x}_{\alpha ..} - \bar{x}_{...})(\bar{x}_{. \beta .} - \bar{x}_{...}) . \]

The basic condition for this sum to be identically zero is that the two summations
over $\alpha$ and $\beta$ can be performed independently, i.e., it must be

\[ R_{\alpha\beta} = R_1(\alpha)\, R_2(\beta) . \]

However, $\sum_\alpha R_1(\alpha)(\bar{x}_{\alpha ..} - \bar{x}_{...})$, say, is identically
zero only if $R_1(\alpha) = C R_{\alpha .}$, with C = const., because of

\[ \sum_\alpha R_1(\alpha)(\bar{x}_{\alpha ..} - \bar{x}_{...})
   = C \sum_\alpha R_{\alpha .}(\bar{x}_{\alpha ..} - \bar{x}_{...}) = 0 . \]

Then, from

\[ R_{\alpha\beta} = C R_{\alpha .}\, R_2(\beta) \]

it follows that

\[ R_{. \beta} = C R_{..}\, R_2(\beta) \]

or:

\[ R_2(\beta) = \frac{R_{. \beta}}{C R_{..}} . \]


Finally, therefore, the condition reads:

\[ R_{\alpha\beta} = \frac{R_{\alpha .}\, R_{. \beta}}{R_{..}} . \]

This, in fact, is a "proportionality condition" because it is equal to

\[ \frac{R_{\alpha\beta}}{R_{\alpha .}} = \frac{R_{. \beta}}{R_{..}} , \]

or in words:

"The number $R_{\alpha\beta}$ of observations in cell '$\alpha\beta$' must be to the
marginal total $R_{\alpha .}$ as the marginal total $R_{. \beta}$ is to the total number
$R_{..}$ of all observations."
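
A minimal sketch of a check of the two-way proportionality condition for a table of cell numbers is given below. The function name and the example tables are hypothetical; the test is written in the cross-multiplied form $R_{\alpha\beta} R_{..} = R_{\alpha .} R_{. \beta}$ so that only integer arithmetic is needed.

```python
def is_proportional(cells):
    """True if every cell number equals (row total * column total) / grand total."""
    row = [sum(r) for r in cells]
    col = [sum(c) for c in zip(*cells)]
    total = sum(row)
    return all(
        cells[i][j] * total == row[i] * col[j]
        for i in range(len(row))
        for j in range(len(col))
    )


proportional = is_proportional([[1, 2, 3], [2, 4, 6]])   # rows are multiples of each other
equal = is_proportional([[4, 4], [4, 4]])                # equal numbers are a special case
not_proportional = is_proportional([[1, 2], [2, 1]])
```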

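Correspondingly, the conditions for the three- and four-way classifications can be
shown to be, respectively:

\[ R_{\alpha\beta\gamma} = \frac{R_{\alpha ..}\, R_{. \beta .}\, R_{.. \gamma}}{R_{...}^2} \]

and

\[ R_{\alpha\beta\gamma\delta}
   = \frac{R_{\alpha ...}\, R_{. \beta ..}\, R_{.. \gamma .}\, R_{... \delta}}{R_{....}^3} . \]

In general, one will have for the n-way classification:

\[ R_{\varphi_1 \varphi_2 \ldots \varphi_n}
   = \frac{R_{\varphi_1 . \ldots .}\; R_{. \varphi_2 . \ldots .} \cdots R_{. \ldots . \varphi_n}}
          {R_{. \ldots .}^{\,n-1}} , \]

where $\varphi_1, \varphi_2, \ldots, \varphi_n$ denote the level subscripts for factors
$\mathcal{F}_1, \mathcal{F}_2, \ldots, \mathcal{F}_n$, respectively.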